3 research outputs found

    A review of Yorùbá Automatic Speech Recognition

    Automatic Speech Recognition (ASR) has recorded appreciable progress in both technology and application. Despite this progress, a wide performance gap still exists between human speech recognition (HSR) and ASR, which has inhibited the full adoption of ASR in real-life situations. This paper presents a brief review of research progress on Yorùbá ASR, focusing on variability as a factor contributing to the performance gap between HSR and ASR, with a view to examining the advances recorded and the major obstacles, and charting a way forward for the development of Yorùbá ASR that is comparable to that of other tone languages and of developed nations. This is done through an extensive survey of the literature on ASR with a focus on Yorùbá. Though appreciable progress has been recorded in the advancement of ASR in the developed world, the reverse is the case for most developing nations, especially those of Africa. Yorùbá, like most languages in Africa, lacks both the human and material resources needed to develop a functional ASR system, much less to take advantage of its potential benefits. Results reveal that attaining the ultimate goal of ASR performance comparable to human level requires a deep understanding of variability factors.

    Accent identification of Malaysian and Nigerian English based on acoustic features

    Purpose - This paper studies the acoustic features of energy, pitch and formants of Malaysian and Nigerian English vowels, with the aim of effective accent identification using multi-linear regression (MLR) and linear discriminant analysis (LDA) classifiers, to improve ASR performance when exposed to accented speech. Accent, being a foremost source of ASR performance degradation, has received great attention from ASR researchers. The majority of ASR applications were developed with speech samples from native English speakers, without considering the fact that most potential users speak English as a second language with a marked accent; hence their poor performance when exposed to accented speech. Previous studies on accent have shown that the ability to correctly recognize accent greatly enhances the recognition performance of ASR on accented speech data. In a study of 14 regional accents of British English, Hanani, Russell, & Carey (2013) achieved a performance increase of 5.58%. A study by Vergyri, Lamel, & Gauvain (2010) using six regional accents of English shows an average WER of 41.43%, which was reduced to 27% on the incorporation of an accent identification module. Several studies have explored acoustic features of speech such as energy, pitch, formants, MFCC, and LPC to establish the differences between regional or cross-ethnic accents, aiming at a better understanding of these differences in order to enhance ASR performance. From the previous studies reviewed above, it is evident that accent constitutes a hurdle to ASR performance and consequently serves as a barrier to the wide reception and use of ASR in real-life situations. It is therefore pertinent that accent be given adequate research attention, with a view to enhancing ASR performance on accented speech, which will in turn promote its wide acceptability and applicability globally.

    Double-stage features extraction for Malays vowel classification using multinomial logistic regression

    Automatic speech recognition (ASR) has recorded enormous development in both research and implementation, such as voice commands to control electronic appliances, video games, interfaces to voice dictation, assistive living for the elderly, and dialogue systems. Rapid development of ASR can be seen for the English language; duplicating the ASR framework for the Malay language is possible, but the work demands considerable effort. One of the common tools able to classify Malay vowels is Multinomial Logistic Regression (MLR). However, careless estimation of the MLR parameters may produce a biased classifier that is inappropriate for future classification. In addition, using a huge number of features for classification sometimes hinders MLR from performing well. This paper outlines a new idea for estimating the unknown MLR parameters with fewer features, using a double-stage feature extraction based on MLR (DSFE-MLR). The proposed DSFE-MLR extracts 39 MFCC features from the speech waveform and constructs an MLR using the training set. Next, the class membership probabilities output by the MLR are further extracted through a second MLR and evaluated using the test set. Empirical evidence from a sample of Malay students shows that DSFE-MLR recorded the highest accuracy compared to other classifiers. In addition, the method is able to recognize each of the five Malay vowels correctly. In general, DSFE-MLR provides an increase in accuracy for Malay speech recognition.
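
    The two-stage idea described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the second stage is read here as a second MLR trained on the class membership probabilities of the first, the MLR is fitted by plain batch gradient descent, and random Gaussian clusters stand in for real 39-dimensional MFCC vectors of five vowels. All function names (`fit_mlr`, `fit_two_stage`, `run_demo`, and so on) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """Class membership probabilities; each weight row ends with a bias."""
    scores = [sum(w * v for w, v in zip(row[:-1], x)) + row[-1] for row in weights]
    return softmax(scores)


def fit_mlr(X, y, n_classes, epochs=80, lr=0.1):
    """Fit an MLR by batch gradient descent on the cross-entropy loss.

    The optimizer is an assumption of this sketch, not the paper's estimator.
    """
    n_features = len(X[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                for j in range(n_features):
                    grads[k][j] += err * x[j]
                grads[k][n_features] += err  # bias gradient
        for k in range(n_classes):
            for j in range(n_features + 1):
                weights[k][j] -= lr * grads[k][j] / len(X)
    return weights


def fit_two_stage(X, y, n_classes):
    """Stage 1: MLR on raw features. Stage 2: MLR on stage-1 probabilities."""
    stage1 = fit_mlr(X, y, n_classes)
    stage1_probs = [predict_proba(stage1, x) for x in X]
    stage2 = fit_mlr(stage1_probs, y, n_classes)
    return stage1, stage2


def predict_two_stage(model, x):
    """Return the most probable class index after both stages."""
    stage1, stage2 = model
    probs = predict_proba(stage2, predict_proba(stage1, x))
    return max(range(len(probs)), key=probs.__getitem__)


def make_synthetic(rng, centers, per_class, noise=0.5):
    """Gaussian clusters standing in for per-vowel MFCC vectors (assumption)."""
    X, y = [], []
    for label, center in enumerate(centers):
        for _ in range(per_class):
            X.append([c + rng.gauss(0.0, noise) for c in center])
            y.append(label)
    return X, y


def run_demo(seed=0, n_classes=5, n_features=39):
    """Train on synthetic data and return held-out accuracy."""
    rng = random.Random(seed)
    centers = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
               for _ in range(n_classes)]
    X_train, y_train = make_synthetic(rng, centers, per_class=12)
    X_test, y_test = make_synthetic(rng, centers, per_class=6)
    model = fit_two_stage(X_train, y_train, n_classes)
    hits = sum(predict_two_stage(model, x) == t for x, t in zip(X_test, y_test))
    return hits / len(y_test)


if __name__ == "__main__":
    print("held-out accuracy on synthetic data:", run_demo())
```

    The second stage sees only five inputs per sample (one probability per vowel class) rather than 39, which is one way to read the abstract's claim of estimating MLR parameters with fewer features. Accuracy on the synthetic clusters says nothing about the results reported for real Malay speech.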